Abstract

In the classic k-center problem, we are given a metric graph, and the objective is to select
k nodes as centers such that the maximum distance from any vertex to its closest center is
minimized. In this paper, we consider two important generalizations of k-center, the matroid
center problem and the knapsack center problem. Both problems are motivated by recent 
content distribution network applications. Our contributions can be summarized as follows: 

1. We consider the matroid center problem in which the centers are required to form an 
independent set of a given matroid. We show this problem is NP-hard even on a line. We 
present a 3-approximation algorithm for the problem on general metrics. We also consider 
the outlier version of the problem where a given number of vertices can be excluded as 
outliers from the solution. We present a 7-approximation for the outlier version. 

2. We consider the (multi-)knapsack center problem in which the centers are required to
satisfy one (or more) knapsack constraint(s). It is known that the knapsack center problem
with a single knapsack constraint admits a 3-approximation. However, when there are at
least two knapsack constraints, we show this problem cannot be approximated within any
factor unless P = NP. To complement the hardness result, we present a polynomial time
algorithm that gives a 3-approximate solution such that one knapsack constraint is satisfied
and the others may be violated by at most a factor of $1 + \epsilon$. We also obtain a
3-approximation for the outlier version that may violate the knapsack constraint by a factor
of $1 + \epsilon$.
1 Introduction



The k-center problem is a fundamental facility location problem. In the basic version, we are given
a metric space $(V, d)$ and are asked to locate a set $S \subseteq V$ of at most $k$ vertices as centers and to
assign the other vertices to the centers, so as to minimize the maximum distance from any vertex
to its assigned center, or more formally, to minimize $\max_{v \in V} \min_{u \in S} d(v, u)$. In the demand version
of the k-center problem, each vertex $v$ has a positive demand $r(v)$, and our goal is to minimize the
maximum weighted distance from any vertex to the centers, i.e., $\max_{v \in V} \min_{u \in S} r(v) d(v, u)$. It is
well known that the k-center problem is NP-hard and admits a polynomial time 2-approximation
even for the demand version OUT], and that no polynomial time (2 — e)-approximation algorithm 
exists unless P = NP [14].

In this paper, we conduct a systematic study on two generalizations of the k-center problem
and their variants. The first one is the matroid center problem, denoted by MatCenter, which is
almost the same as the k-center problem except that, instead of the cardinality constraint on the
set of centers, now the centers are required to form an independent set of a given matroid. A finite
matroid $\mathcal{M}$ is a pair $(V, \mathcal{I})$, where $V$ is a finite set (called the ground set) and $\mathcal{I}$ is a collection of
subsets of $V$. Each element in $\mathcal{I}$ is called an independent set. Moreover, $\mathcal{M} = (V, \mathcal{I})$ satisfies the
following three properties: (1) $\emptyset \in \mathcal{I}$; (2) if $A \subseteq B$ and $B \in \mathcal{I}$, then $A \in \mathcal{I}$; (3) for all $A, B \in \mathcal{I}$ with
$|A| > |B|$, there exists an element $e \in A \setminus B$ such that $B \cup \{e\} \in \mathcal{I}$. Following the conventions in
the literature, we assume the matroid $\mathcal{M}$ is given by an independence oracle which, given a subset
$S \subseteq V$, decides whether $S \in \mathcal{I}$. For more information about the theory of matroids, see, e.g., [29].
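
To make the oracle model concrete, the following is a minimal sketch of an independence oracle for a partition matroid, together with a brute-force check of properties (1)-(3). The function names `partition_oracle` and `satisfies_matroid_axioms` are illustrative and not part of the paper; the axiom check enumerates all subsets and is only meant for tiny ground sets.

```python
from itertools import combinations

def partition_oracle(blocks, caps):
    """Independence oracle of a partition matroid.

    blocks: list of disjoint sets that together cover the ground set.
    caps:   caps[j] is the maximum number of elements allowed from blocks[j].
    """
    def is_independent(subset):
        return all(len(subset & block) <= cap for block, cap in zip(blocks, caps))
    return is_independent

def satisfies_matroid_axioms(ground, is_independent):
    """Brute-force check of properties (1)-(3); exponential in |ground|."""
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(sorted(ground), r)]
    independent = [s for s in subsets if is_independent(s)]
    indep_set = set(independent)
    # Property (1): the empty set is independent.
    if frozenset() not in indep_set:
        return False
    # Property (2): every subset of an independent set is independent.
    for b in independent:
        if any(a <= b and a not in indep_set for a in subsets):
            return False
    # Property (3): a larger independent set can always extend a smaller one.
    for a in independent:
        for b in independent:
            if len(a) > len(b) and not any(is_independent(b | {e}) for e in a - b):
                return False
    return True

# Two blocks {0, 1} and {2, 3}; at most one element from the first, two from the second.
oracle = partition_oracle([{0, 1}, {2, 3}], [1, 2])
print(oracle(frozenset({0, 2, 3})), oracle(frozenset({0, 1})))
print(satisfies_matroid_axioms({0, 1, 2, 3}, oracle))
```

The algorithms in the paper only ever query such an oracle; they never need an explicit list of independent sets.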

The second problem we study is the knapsack center problem (denoted as KnapCenter), another
generalization of k-center in which the chosen centers are subject to (one or more) knapsack
constraints. More formally, in KnapCenter, there are $m$ nonnegative weight functions $w_1, \ldots, w_m$ on
$V$, and $m$ weight budgets $B_1, \ldots, B_m$. Let $w_i(V') := \sum_{v \in V'} w_i(v)$ for all $V' \subseteq V$. A solution takes
a set of vertices $S \subseteq V$ as centers such that $w_i(S) \le B_i$ for all $1 \le i \le m$. The objective is still to
minimize the maximum service cost of any vertex in $V$ (the service cost of $v$ equals $\min_{c \in S} d(v, c)$,
or $\min_{c \in S} r(v) d(v, c)$ in the demand version). In this paper, we are interested only in the case
where the number $m$ of knapsack constraints is a constant. We note that the special case with only
one knapsack constraint was studied in [18] under the name of weighted k-center, which already
generalizes the basic k-center problem.
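
The two ingredients of a KnapCenter solution, feasibility with respect to the budgets and the maximum service cost, can be evaluated as in the sketch below. The helper names `service_cost` and `is_feasible` and the representation (a distance matrix indexed by vertex number, one weight list per knapsack constraint) are assumptions made for illustration.

```python
def service_cost(dist, centers, demand=None):
    """Maximum (optionally demand-weighted) distance from a vertex to its nearest center."""
    return max(
        (1 if demand is None else demand[v]) * min(dist[v][c] for c in centers)
        for v in range(len(dist))
    )

def is_feasible(centers, weights, budgets):
    """Check w_i(S) <= B_i for every knapsack constraint i."""
    return all(sum(w[c] for c in centers) <= b for w, b in zip(weights, budgets))

# Four points on a line at positions 0, 1, 3, 6; one knapsack constraint.
xs = [0, 1, 3, 6]
dist = [[abs(a - b) for b in xs] for a in xs]
centers = [1, 3]                       # the vertices at positions 1 and 6
print(is_feasible(centers, weights=[[1, 2, 3, 4]], budgets=[6]))
print(service_cost(dist, centers))     # vertex at position 3 is served at distance 2
```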

Both MatCenter and KnapCenter are motivated by important applications in content distribution 
networks [16, 22]. In a content distribution network, there are several types of servers and a set
of clients to be connected to the servers. Often there is a budget constraint on the number of 
deployed servers of each type pLBj. We would like to deploy a set of servers subject to these budget 
constraints in order to minimize the maximum service cost of any client. The budget constraints 
correspond to finding an independent set in a partition matroid.* We can also use a set of knapsack
constraints to capture the budget constraints for all types (we need one knapsack constraint for 
each type). Motivated by such applications, Hajiaghayi et al. [16] first studied the red-blue median
problem in which there are two types (red and blue) of facilities, and the goal is to deploy at most
$k_r$ red facilities and $k_b$ blue facilities so as to minimize the sum of service costs. Subsequently,
Krishnaswamy et al. [22] introduced a more general matroid median problem which seeks to select

* Let $B_1, B_2, \ldots, B_b$ be a collection of disjoint subsets of $V$ and $d_i$ be integers such that $1 \le d_i \le |B_i|$ for all
$1 \le i \le b$. We say a set $I \subseteq V$ is independent if $|I \cap B_i| \le d_i$ for $1 \le i \le b$. All such independent sets form a partition
matroid.
a set of facilities that is an independent set in a given matroid and the knapsack median problem in
which the set of facilities must satisfy a knapsack constraint. The work mentioned above uses the
sum of service costs as the objective (the k-median objective), while our work aims to minimize the
maximum service cost (the k-center objective), which is another popular objective in the clustering
and network design literature.

1.1 Our Results 

For MatCenter, we show the problem is NP-hard to approximate within a factor of $2 - \epsilon$ for any
constant $\epsilon > 0$, even on a line. Note that the k-center problem on a line can be solved exactly in
polynomial time [5]. We present a 3-approximation algorithm for MatCenter on general metrics.
This improves the constant factors implied by the approximation algorithms for matroid median
[22, 3] (see Section 2.2 for details).

Next, we consider the outlier version of MatCenter, denoted as Robust-MatCenter, where one 
can exclude at most $n - p$ nodes (where $n = |V|$) as outliers. We obtain a 7-approximation for Robust-MatCenter.
Our algorithm is a nontrivial generalization of the greedy algorithm of Charikar et al. [2], which
only works for the outlier version of the basic k-center. However, their algorithm and analysis do
not extend to our problem. In their analysis, if at least p nodes are covered by k disks (with radius 
3 times OPT), they have found a set of k centers and obtained a 3-approximation. However, in 
our case, we may not be able to open enough centers in the covered region, due to the matroid 
constraint. Therefore, we need to search for centers globally. To this end, we carefully construct 
two matroids and argue that their intersection provides a desirable answer (the construction is 
similar to that for the non-outlier version, but more involved). 

We next deal with the KnapCenter problem. We show that for any $f > 0$, the existence of
an $f$-approximation algorithm for KnapCenter with more than one knapsack constraint implies
P = NP. This is in sharp contrast with the case of only one knapsack constraint, for which a
3-approximation exists [18] and is known to be optimal [7]. Given this strong inapproximability result,
it is then natural to ask whether efficient approximation algorithms exist if we are allowed to slightly
violate the constraints. We answer this question affirmatively. We provide a polynomial time
algorithm that, given an instance of KnapCenter with a constant number of knapsack constraints,
finds a 3-approximate solution that is guaranteed to satisfy one constraint and violate each of the
others by at most a factor of $1 + \epsilon$ for any fixed $\epsilon > 0$. This generalizes the result of [18] to the
multi-constraint case. Our algorithm also works for the demand version of the problem. 

We then consider the outlier version of the knapsack center problem, which we denote by 
Robust-KnapCenter. We present a 3-approximation algorithm for Robust-KnapCenter that violates 
the knapsack constraint by a factor of $1 + \epsilon$ for any fixed $\epsilon > 0$. Our algorithm can be regarded as
a "weighted" version of the greedy algorithm of Charikar et al. [2] which only works for the unit- 
weight case. However, their charging argument does not apply to the weighted case. We instead 
adopt a more involved algebraic approach to prove the performance guarantee. We translate our 
algorithm into inequalities involving point sets, and then directly manipulate the inequalities to 
establish our desired approximation ratio. The total weight of our chosen centers may exceed the 
budget by the maximum weight of any client, which can be turned into a $1 + \epsilon$ multiplicative factor
by the partial enumeration technique. We leave open the question whether there is a constant 
factor approximation for Robust-KnapCenter that satisfies the knapsack constraint. 
1.2 Related Work



For the basic k-center problem, Hochbaum and Shmoys [17, 18] and Gonzalez [T3| developed
2-approximation algorithms, which are the best possible if P $\neq$ NP [14]. The former algorithms are
based on the idea of the threshold method, which originates from [10]. On some special metrics
like the shortest path metrics on trees, k-center (with or without demands) can typically be solved
in polynomial time by dynamic programming. By exploring additional structures of the metrics,
even linear or quasi-linear time algorithms can be obtained; see, e.g., [5, 8, 11] and the references
therein. Several generalizations and variations of k-center have also been studied in a variety of
application contexts; see, e.g. [Tl [2511201 HII91I2T] . 

A problem closely related to k-center is the well-known k-median problem, whose objective
is to minimize the sum of service costs of all nodes instead of the maximum one. Hajiaghayi et
al. [16] introduced the red-blue median problem that generalizes k-median, and presented a constant
factor approximation based on local search. Krishnaswamy et al. [22] introduced the more general
matroid median problem and presented a 16-approximation algorithm based on LP rounding, whose
ratio was improved to 9 by Charikar and Li [3] using a more careful rounding scheme. Another
generalization of k-median is the knapsack median problem studied by Kumar [23], which requires
opening a set of centers with a total weight no larger than a specified value. Kumar gave a (large)
constant factor approximation for knapsack median, which was improved by Charikar and Li [3]
to a 34-approximation. Several other classical problems have also been investigated recently under
matroid or knapsack constraints, such as minimum spanning tree [32], maximum matching [15],
and sub modular maximization [2^ [30] . 

For the k-center formulation, it is well known that a few distant vertices (outliers) can
disproportionately affect the final solution. Such outliers may significantly increase the cost of the
solution, without improving the level of service to the majority of clients. To deal with outliers,
Charikar et al. [2] initiated the study of the robust versions of k-center and other related problems,
in which a certain number of points can be excluded as outliers. They gave a 3-approximation
for robust k-center, and showed that the problem with forbidden centers (i.e., some points cannot
be centers) is inapproximable within $3 - \epsilon$ unless P = NP. For robust k-median, they presented
a bicriteria approximation algorithm that returns a $4(1 + 1/\epsilon)$-approximate solution in which the
number of excluded outliers may violate the upper bound by a factor of $1 + \epsilon$. Later, Chen [6]
gave a truly constant factor approximation (with a very large constant) for the robust k-median
problem. McCutchen and Khuller [26] and Zarrabi-Zadeh and Mukhopadhyay [31] considered the
robust k-center problem in a streaming context.

2 The Matroid Center Problem 

In this section, we consider the matroid center problem and its outlier version. A useful ingredient
of our algorithms is the (weighted) matroid intersection problem defined as follows. We are given
two matroids $\mathcal{M}_1 = (V, \mathcal{I}_1)$ and $\mathcal{M}_2 = (V, \mathcal{I}_2)$ defined on the same ground set $V$. Each element $v \in V$
has a weight $w(v) \ge 0$. The goal is to find a common independent set $S$ in the two matroids, i.e.,
$S \in \mathcal{I}_1 \cap \mathcal{I}_2$, such that the total weight $w(S) = \sum_{v \in S} w(v)$ is maximized. It is well known that this
problem can be solved in polynomial time (e.g., see [29]).
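
As an illustration of the problem statement only, the sketch below solves weighted matroid intersection by exhaustive enumeration. It is exponential and is not the polynomial time algorithm referred to above; the function name and the oracle-based interface are assumptions made for this example. With two partition matroids over the edges of a bipartite graph, the problem becomes maximum-weight bipartite matching.

```python
from itertools import combinations

def max_weight_common_independent_set(ground, indep1, indep2, weight):
    """Return (S, w(S)) for a maximum-weight set independent in both matroids.

    Exhaustive search over all subsets; suitable for tiny ground sets only.
    """
    best, best_weight = frozenset(), 0
    for r in range(1, len(ground) + 1):
        for combo in combinations(sorted(ground), r):
            s = frozenset(combo)
            if indep1(s) and indep2(s):
                w = sum(weight[v] for v in s)
                if w > best_weight:
                    best, best_weight = s, w
    return best, best_weight

# Ground set: edges of a bipartite graph, written as (left endpoint, right endpoint).
edges = [("a", "x"), ("a", "y"), ("b", "x")]
weight = {("a", "x"): 3, ("a", "y"): 2, ("b", "x"): 2}
left_distinct = lambda s: len({l for l, _ in s}) == len(s)    # partition matroid 1
right_distinct = lambda s: len({r for _, r in s}) == len(s)   # partition matroid 2
print(max_weight_common_independent_set(edges, left_distinct, right_distinct, weight))
```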
2.1 NP-hardness of Matroid Centers on a Line



In contrast to the basic k-center problem on a line, which can be solved in near-linear time [5], we
show that MatCenter is NP-hard even on a line. We actually prove the following stronger theorem.

Theorem 1. It is NP-hard to approximate MatCenter on a line within a factor strictly better than
2, even when the given matroid is a partition matroid.

Proof. In a partition matroid, each element in the ground set is colored using one of $h$ colors
and we are given $h$ integers $b_1, b_2, \ldots, b_h$. The collection of all independent sets is defined to be all
subsets that contain at most $b_1$ elements of color 1, at most $b_2$ elements of color 2, and so on.
Figure 1: A variable gadget and a clause gadget.



We use the 3SAT problem for the reduction. Without loss of generality, we assume that each
literal (including all variables $x_i$ and their negations $\bar{x}_i$) appears exactly four times in the 3DNF.
Given a 3DNF, we create a MatCenter instance as follows. The points appear in groups. Each
group consists of $r$ ($r \ge 3$) points with $r - 2$ points in the middle, one to the left and one to the
right. The left and right points are 1 unit distance away from the midpoints. Different groups are
very far away from each other. Therefore, in order to make the maximum radius at most one, we
need to either select one of the midpoints in each group or select at least the two points not in
the middle. For each variable $x_i$, we create a variable gadget as follows. The gadget consists of 6
groups, each having 3 points:

{phpli,p)i), {qhqli,q}i), {p\\p2i,pi'), ipf^p'ii^pf), iQ^Q^ii^qi'), {<i,^^,€)- 

For two points p and q, we use [p, q] to indicate that we assign a new color to p and q. The color 
assignment for the gadget is defined by the following pairs: 

We are allowed to choose at most one point as a center from each color class. Points Pi^,P2^,p^^,p^^ 
are called positive portals of Xj and points qf^ , q"!^ , q"^^ , qf are called negative portals of 
$x_i$ (see Figure 1 for an example). For each clause, we create a clause gadget, which is a group of 5 points.
We have 3 points in the middle (co-located at the same place), each corresponding to a literal in the
clause. If the point corresponds to a positive (negative) literal, say $x_i$ (or $\bar{x}_i$), the point is paired
with one of the positive (negative) portals of $x_i$ and we assign the pair a new color. We also require
that at most one point can be chosen as a center in this pair. Each portal can be paired at most
once. Since each literal appears exactly 4 times, we have enough portals for the clause gadgets.
All the left and right points of all clause gadgets have the same color, and the budget of this color
is zero, so none of them can be chosen as centers.

We can show that the optimal radius for the MatCenter instance is 1 if and only if the 3DNF 
formula is satisfiable. First, suppose the 3DNF is satisfiable. If is TRUE in a truth assign- 
ment, then we pick p\,f,p^M^P^M Pi^,P2^,P^s,pf as centers. Otherwise, we pick qM,qM,qM ^^'^ 
Qi^j Q2^jQ^jQ^i as centers. It is straightforward to verify the independence property. For each group, 
at least one of the midpoints is selected. Thus, the optimal solution is 1. Given the correspondence, 
the reverse direction can be proved similarly and we omit it. □ 

2.2 A 3-Approximation for MatCenter

In fact, we can obtain a constant approximation for MatCenter by using the constant approximation
for the matroid median problem [22, 3], which roughly gives a 9-approximation for MatCenter. The
idea is given below.

We say a space $V$ with a distance function $d$ satisfies the $(\lambda, c)$-relaxed triangle inequality (TI)
for some $\lambda$ and $c$, if $d(a_0, a_c) \le \lambda \sum_{i=1}^{c} d(a_{i-1}, a_i)$ for all $a_0, a_1, \ldots, a_c \in V$. (Thus a metric
space satisfies the $(1, c)$-relaxed TI for all $c \ge 1$.) By examining the algorithms in [22, 3] for the
matroid median problem, we notice that they can actually give a $(\mu\lambda)$-approximation for matroid
median where $\mu$ is some universal constant, if the underlying space satisfies the $(\lambda, c_0)$-relaxed TI
for some algorithm-dependent $c_0$.† (Roughly speaking, $c_0$ is the maximum number of times that
the triangle inequality is used for bounding the distance between a client and a facility.) Now,
given an instance of MatCenter with metric space $(V, d)$, we define a new distance function $d'$ as
$d'(a, b) = (d(a, b))^p$ for all $a, b \in V$, where $p \ge 2$ is a parameter whose value will be specified later.
By the convexity of the function $f(x) = x^p$ when $p \ge 2$, for all $c \ge 1$ and $a_0, a_1, \ldots, a_c \in V$, we
have $\left(\sum_{i=1}^{c} d(a_{i-1}, a_i)/c\right)^p \le \sum_{i=1}^{c} d(a_{i-1}, a_i)^p / c$, and thus

$$d'(a_0, a_c) = d(a_0, a_c)^p \le \left(\sum_{i=1}^{c} d(a_{i-1}, a_i)\right)^p \le c^{p-1} \sum_{i=1}^{c} d(a_{i-1}, a_i)^p = c^{p-1} \sum_{i=1}^{c} d'(a_{i-1}, a_i).$$

Therefore $(V, d')$ satisfies the $(c^{p-1}, c)$-relaxed TI for all $c \ge 1$. In particular, it satisfies the
$(c_0^{p-1}, c_0)$-relaxed TI where $c_0$ is the algorithm-dependent parameter mentioned before. We now
solve the matroid median problem on the instance with the new distance function $d'$. Let OPT
denote the optimal objective value of MatCenter on the original instance. Then it is clear that
the optimal cost of matroid median on the new instance is at most $|V| \cdot \mathrm{OPT}^p$. By our previous
observation, the algorithms of [22, 3] give a solution of cost at most $\mu c_0^{p-1} |V| \mathrm{OPT}^p$.
Transforming the distance function back to $d$, the maximum service cost of any client is at most

† We note that Golovin et al. [13] claimed (without a proof) that, in our notations, most existing approximation
algorithms for k-median achieve an $O(\lambda)$-approximation on spaces satisfying $(\lambda, 2)$-relaxed TI. By a scrutiny of the
existing k-median algorithms, we are not able to reproduce the same result and the correct approximation ratio
should be roughly 0(A'^''). However, the results of [13] are not affected in any essential way since this only changes 
the constant hidden in the big-oh notation. 
Algorithm 1: Algorithm for MatCenter on $G_i$

1 Initially, $C \leftarrow \emptyset$ and mark all vertices in $V$ as uncovered.

2 while $V$ contains uncovered vertices do

3     Pick an uncovered vertex $v$. Set $B(v) \leftarrow B(v, d(e_i))$ and $C \leftarrow C \cup \{v\}$.

4     Mark all vertices in $B(v, 2d(e_i))$ as covered.

5 end

6 Define a partition matroid $\mathcal{M}_B = (V, \mathcal{I})$ with partition $\{\{B(v)\}_{v \in C}, V \setminus \bigcup_{v \in C} B(v)\}$ (note
that $\{B(v)\}_{v \in C}$ are disjoint sets by Lemma 1), where $\mathcal{I}$ is the set of subsets of $V$ that
contain at most 1 element from every $B(v)$ and 0 elements from $V \setminus \bigcup_{v \in C} B(v)$.

7 Solve the unweighted (or, unit-weight) matroid intersection problem between $\mathcal{M}_B$ and $\mathcal{M}$ to
get an optimal intersection $S$. If $|S| < |C|$, then we declare a failure and try the next $G_i$.
Otherwise, we succeed and return $S$ as the set of centers.



$(\mu c_0^{p-1} |V| \mathrm{OPT}^p)^{1/p} = c_0^{(p-1)/p} (\mu |V|)^{1/p} \mathrm{OPT}$. By choosing $p = \Omega(|V|)$, this can produce a $(c_0 + \epsilon)$-
approximation for MatCenter for any fixed $\epsilon > 0$. Using the algorithm of [3], this roughly gives a
9-approximation.
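
A quick numerical sanity check of the two facts used in this argument may be helpful: the powered distance $d' = d^p$ satisfies the $(c^{p-1}, c)$-relaxed TI, and the bound $c_0^{(p-1)/p} (\mu |V|)^{1/p}$ tends to $c_0$ as $p$ grows. The sketch below uses points on a line and purely hypothetical values for $c_0$, $\mu$ and $|V|$; none of these numbers come from the paper.

```python
def center_ratio(c0, mu, n, p):
    """Bound c0^((p-1)/p) * (mu*n)^(1/p) on the MatCenter ratio obtained via d' = d^p."""
    return c0 ** ((p - 1) / p) * (mu * n) ** (1 / p)

def satisfies_relaxed_ti(points, p):
    """Check d'(a_0, a_c) <= c^(p-1) * sum_i d'(a_{i-1}, a_i) for a chain of points on a line."""
    c = len(points) - 1
    lhs = abs(points[0] - points[-1]) ** p
    rhs = c ** (p - 1) * sum(abs(a - b) ** p for a, b in zip(points, points[1:]))
    return lhs <= rhs * (1 + 1e-12)   # small tolerance for floating-point rounding

print(satisfies_relaxed_ti([0, 1, 3, 4.5, 10], p=3))
# Hypothetical constants: c0 = 9, mu = 2, |V| = 100.
for p in (2, 8, 64, 4096):
    print(p, round(center_ratio(c0=9, mu=2, n=100, p=p), 4))
```

The printed ratios decrease toward the hypothetical $c_0$ as $p$ increases, which is the effect exploited by the choice of $p$ above.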

We next present a 3-approximation for MatCenter, thus improving the ratio derived from the
matroid median algorithms [22, 3]. Also, compared to their LP-based algorithms, ours is simpler,
purely combinatorial, and easy to implement. We begin with the description of our algorithm.
Regard the metric space as a (complete) graph $G = (V, E)$ where each edge $\{u, v\}$ has length
$d(u, v)$. Let $B(v, r)$ be the set of vertices that are at most $r$ unit distance away from $v$ (it depends
on the underlying graph). Let $e_1, e_2, \ldots, e_{|E|}$ be the edges in a non-decreasing order of their lengths.
We consider each spanning subgraph $G_i$ of $G$ that contains only the first $i$ edges, i.e., $G_i = (V, E_i)$
where $E_i = \{e_1, \ldots, e_i\}$. We run Algorithm 1 on each $G_i$ and take the best solution.

Lemma 1. For any two distinct $u, v \in C$, $B(u)$ and $B(v)$ are disjoint sets.

Proof. Suppose we are working on $G_i$ and there is a node $w$ that is in both $B(u)$ and $B(v)$. Then
we know $d(w, u) \le d(e_i)$ and $d(w, v) \le d(e_i)$. Thus, $d(u, v) \le 2d(e_i)$. But this contradicts the
fact that the distance between every two nodes in $C$ must be larger than $2d(e_i)$. □

Theorem 2. Algorithm 1 produces a 3-approximation for MatCenter.

Proof. Suppose the maximum radius of any cluster in an optimal solution is $r^*$ and a set of optimal
centers is $C^*$. Consider the algorithm on $G_i$ with $d(e_i) = r^*$ ($r^*$ must be the length of some edge).
First we claim that there exists an intersection of $\mathcal{M}$ and $\mathcal{M}_B$ of size $|C|$. In fact, we show there
is a subset of $C^*$ that is such an intersection. For each node $u$, let $a(u)$ be an optimal center in $C^*$
that is at most $d(e_i)$ away from $u$. Consider the set $S^* = \{a(u)\}_{u \in C}$. Since $S^*$ is a subset of $C^*$,
it is an independent set of $\mathcal{M}$ by the definition of matroid. It is also easy to see that $a(u) \in B(u)$
for each $u \in C$; since the sets $B(u)$ are pairwise disjoint by Lemma 1, the centers $a(u)$ are distinct
and $|S^*| = |C|$. Therefore, $S^*$ is also independent in $\mathcal{M}_B$, which proves our claim. Thus, the
algorithm returns a set $S$ that contains exactly 1 element from each $B(v)$ with $v \in C$. According to
the algorithm, for each $v \in V$ there exists $u \in C$ that is at most $2d(e_i)$ away, and this $u$ is within
distance $d(e_i)$ from the (unique) element in $B(u) \cap S$. Thus every node of $V$ is within a distance
$3d(e_i) = 3r^*$ from some center in $S$. □
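
The following is a minimal sketch of Algorithm 1, under a few stated assumptions: the metric is given as a full distance matrix, balls are taken with respect to the metric itself (rather than the graph $G_i$), candidate radii range over all distinct pairwise distances (including 0), and the matroid $\mathcal{M}$ is accessed through a hypothetical oracle `indep`. The unweighted matroid intersection step uses the textbook shortest-augmenting-path method on the exchange graph; the paper only requires that some polynomial time intersection routine is used.

```python
def matroid_intersection(ground, indep1, indep2):
    """Maximum-cardinality common independent set via shortest augmenting paths."""
    s = set()
    while True:
        outside = [x for x in ground if x not in s]
        sources = [x for x in outside if indep1(s | {x})]
        sinks = {x for x in outside if indep2(s | {x})}
        parent = {x: None for x in sources}
        queue, head, end = list(sources), 0, None
        while head < len(queue):              # breadth-first search in the exchange graph
            cur = queue[head]
            head += 1
            if cur in sinks:
                end = cur
                break
            if cur in s:   # arc y -> x whenever s - y + x is independent in matroid 1
                nxt = [x for x in outside
                       if x not in parent and indep1((s - {cur}) | {x})]
            else:          # arc x -> y whenever s - y + x is independent in matroid 2
                nxt = [y for y in s
                       if y not in parent and indep2((s - {y}) | {cur})]
            for z in nxt:
                parent[z] = cur
                queue.append(z)
        if end is None:
            return s
        while end is not None:                # augment: toggle every element on the path
            s ^= {end}
            end = parent[end]

def mat_center(dist, indep):
    """Try every candidate radius as in Algorithm 1; return the best (cost, centers) or None."""
    n, best = len(dist), None
    for r in sorted({dist[u][v] for u in range(n) for v in range(n)}):
        uncovered, balls = set(range(n)), []
        while uncovered:
            v = min(uncovered)                                   # any uncovered vertex works
            balls.append({u for u in range(n) if dist[u][v] <= r})
            uncovered -= {u for u in range(n) if dist[u][v] <= 2 * r}
        def indep_b(s, balls=balls):   # M_B: at most one element per ball, none outside
            return (all(len(s & b) <= 1 for b in balls)
                    and all(any(x in b for b in balls) for x in s))
        centers = matroid_intersection(range(n), indep_b, indep)
        if len(centers) == len(balls):                           # success on this radius
            cost = max(min(dist[v][c] for c in centers) for v in range(n))
            if best is None or cost < best[0]:
                best = (cost, centers)
    return best

# Six points on a line; partition matroid allowing one center per color.
xs = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in xs] for a in xs]
color = [0, 0, 0, 1, 1, 1]
indep = lambda s: all(sum(1 for x in s if color[x] == c) <= 1 for c in (0, 1))
print(mat_center(dist, indep))
```

On this instance the optimal radius is 1 (centers at positions 1 and 11), so by Theorem 2 the returned cost should be at most 3.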
2.3 Dealing with Outliers: Robust-MatCenter

We now consider the outlier version of MatCenter, denoted as Robust-MatCenter, in which an
additional parameter $p$ is given and the goal is to place centers (which must form an independent
set) such that after excluding at most $|V| - p$ nodes as outliers, the maximum service cost of any
node is minimized. For $p = |V|$, we have the standard MatCenter. In this section, we present a
7-approximation for Robust-MatCenter.

Our algorithm bears some similarity to the 3-approximation algorithm for robust k-center by
Charikar et al. [2], who also showed that robust k-center with forbidden centers cannot be
approximated within $3 - \epsilon$ unless P = NP. However, their algorithm for robust k-center does not directly
yield any approximation ratio for the forbidden center version. In fact, robust k-center with
forbidden centers is a special case of Robust-MatCenter since forbidden centers can be easily captured
by a partition matroid. We briefly describe the algorithm in [2]. Assume we have guessed the right
optimal radius $r$. For each $v \in V$, call $B(v, r)$ the disk of $v$ and $B(v, 3r)$ the expanded disk of $v$.
Repeat the following step $k$ times: Pick an uncovered vertex as a center such that its disk covers
the most number of uncovered nodes, then mark all nodes in the corresponding expanded disk as
covered. Using a clever charging argument they showed that at least $p$ nodes can be covered, which
gives a 3-approximation. However, their algorithm and analysis do not extend to our problem in
a straightforward manner. The reason is that even if at least $p$ nodes are covered, we may not be
able to find enough centers in the covered region due to the matroid constraint. In order to remedy
this issue, we need to search for centers in the entire graph, which also necessitates a more careful
charging argument to show that we can cover at least $p$ nodes.

Now we describe our algorithm and prove its performance guarantee. For each $1 \le i \le \binom{|V|}{2}$,
we run Algorithm 2 on the graph $G_i$ defined as before. We need the following simple lemma.

Lemma 2. $\mathcal{M}_1$ is a matroid.

Proof. It is straightforward to verify that the first and second matroid properties hold. We only
need to verify the third property. Suppose $A$ and $B$ are two independent sets of $\mathcal{M}_1$ and $|A| > |B|$.
We know the set $V(A)$ (resp., $V(B)$) of vertices that appear in $A$ (resp., $B$) is an independent
set of $\mathcal{M}$. Since $|V(A)| = |A|$ and $|V(B)| = |B|$, $|V(A)| > |V(B)|$. Hence, there is a vertex
$v \in V(A) \setminus V(B)$ such that $V(B) \cup \{v\}$ is independent. We add to $B$ the pair in $A$ that involves $v$
and it is easy to see the resulting set is also independent in $\mathcal{M}_1$. □

Theorem 3. Algorithm 2 produces a 7-approximation for Robust-MatCenter.

Proof. Assume the maximum radius of any cluster in an optimal solution is $r^*$ and the set of optimal
centers is $C^*$. For each $v \in C^*$, let $O(v)$ denote the optimal disk $B(v, r^*)$. As before, we claim that
our algorithm succeeds if $d(e_i) = r^*$. It suffices to show the existence of an intersection of $\mathcal{M}_1$ and
$\mathcal{M}_2$ with a weight at least $p$. We next construct such an intersection $S'$ from the optimal center
set $C^*$. The high level idea is as follows. Let the disk centers in $C$ be $v_1, v_2, \ldots, v_k$ (according
to the order that our algorithm chooses them). Note that $v_1, v_2, \ldots, v_k$ are the centers chosen by
the greedy procedure in the first part of the algorithm, but not the centers returned at last. We
process these centers one by one. Initially, $S'$ is empty. As we process a new center $v_j$, we may
add $(v, E(v_j))$ for some $v \in C^*$ to $S'$. Moreover, we charge each newly covered node in any optimal
disk to some nearby node in the expanded disk $E(v_j)$. (Note that this is the key difference between
our charging argument and that of [2]; in [2], a node may be charged to some node far away.) We
maintain that all nodes in $\bigcup_{v \in C^*} O(v)$ covered by $\bigcup_{j' \le j} E(v_{j'})$ are charged after processing $v_j$. Thus,
Algorithm 2: Algorithm for Robust-MatCenter on $G_i$

1 Initially, set $C \leftarrow \emptyset$ and mark all vertices in $V$ as uncovered.

2 while $V$ contains uncovered vertices do

3     Pick an uncovered vertex $v$ such that $B(v, d(e_i))$ covers the most number of uncovered
elements.

4     $B(v) \leftarrow B(v, d(e_i))$. ($B(v)$ is called the disk of $v$.)

5     $E(v) \leftarrow B(v, 3d(e_i)) \setminus \bigcup_{u \in C} E(u)$. ($E(v)$ is called the expanded disk of $v$. This definition
ensures that all expanded disks in $\{E(u)\}_{u \in C}$ are pairwise disjoint.)

6     $C \leftarrow C \cup \{v\}$. Mark all vertices in $E(v)$ as covered.

7 end

8 Create a set $U$ of (vertex, expanded disk) pairs, as follows: For each $v \in V$ and $u \in C$, if
$B(v, d(e_i)) \cap B(u, 3d(e_i)) \neq \emptyset$, we add $(v, E(u))$ to $U$. The weight $w((v, E(u)))$ of the pair
$(v, E(u))$ is $|E(u)|$.

9 Define two matroids $\mathcal{M}_1$ and $\mathcal{M}_2$ over $U$ as follows:

• A subset $\{(v_i, E(u_i))\}$ is independent in $\mathcal{M}_1$ if all $v_i$'s in the subset are
distinct and form an independent set in $\mathcal{M}$.

• A subset $\{(v_i, E(u_i))\}$ is independent in $\mathcal{M}_2$ if all $E(u_i)$'s in the subset are distinct.
(It is easy to see $\mathcal{M}_2$ is a partition matroid.)

10 Solve the matroid intersection problem between $\mathcal{M}_1$ and $\mathcal{M}_2$ optimally (note that the
independence oracles for $\mathcal{M}_1$ and $\mathcal{M}_2$ can be easily simulated in polynomial time). Let $S$ be
an optimal intersection. If $w(S) < p$, then we declare a failure and try the next $G_i$.
Otherwise, we succeed and return $V(S)$ as the set of centers, where
$V(S) = \{v \mid (v, E(u)) \in S \text{ for some } u \in C\}$.



eventually, all nodes covered by the optimal solution (i.e., $\bigcup_{v \in C^*} O(v)$) are charged to the expanded
disks selected by our algorithm. We also make sure that each node in any expanded disk in $S'$ is
charged to at most once. Therefore, the weight of $S'$ is at least $|\bigcup_{v \in C^*} O(v)| \ge p$.

Now, we present the details of the construction of $S'$. If every node in $O(v)$ for some $v \in C^*$ is
charged, we say $O(v)$ is entirely charged. Consider the step when we process $v_j \in C$. We distinguish
the following cases.

1. Suppose there is a node $v \in C^*$ such that $O(v)$ is not entirely charged and $O(v)$ intersects
$B(v_j)$. Then add $(v, E(v_j))$ to $S'$ (if there are multiple such $v$'s, we only add one of them).
We charge the newly covered nodes in $\bigcup_{v \in C^*} O(v)$ (i.e., the nodes in $(\bigcup_{v \in C^*} O(v)) \cap E(v_j)$) to
themselves (we call this charging rule I). Note that $O(v)$ is entirely charged after this step
since $O(v) \subseteq B(v_j, 3r^*)$.

2. Suppose $B(v_j)$ does not intersect $O(v)$ for any $v \in C^*$, but there is some node $v \in C^*$ such that
$O(v)$ is not entirely charged and $O(v) \cap E(v_j) \neq \emptyset$. Then we add $(v, E(v_j))$ to $S'$ and charge
all newly covered nodes in $O(v)$ (i.e., the nodes in $O(v) \cap E(v_j)$) to $B(v_j)$ (we call this charging
rule II). Since $B(v_j)$ covers the most number of uncovered elements when $v_j$ is added, there
are enough vertices in $B(v_j)$ to charge. Obviously, $O(v)$ is entirely charged after this step. If
there is some other node $u \in C^*$ such that $O(u)$ is not entirely charged and $O(u) \cap E(v_j) \neq \emptyset$,
then we charge each newly covered node in $O(u)$ (i.e., the nodes in $O(u) \cap E(v_j)$) to itself using
rule I.

3. If $E(v_j)$ does not intersect any optimal disk $O(v)$ that is not entirely charged, then we
simply skip this iteration and continue to the next $v_j$.

It is easy to see that all covered nodes in $\bigcup_{v \in C^*} O(v)$ are charged in the process and each node
is charged to at most once. Indeed, consider a node $u$ in $B(v_j)$. If $B(v_j)$ intersects some
$O(v)$, then $u$ may be charged by rule I and, in this case, no further node can be charged to $u$
again. If $B(v_j)$ does not intersect any $O(v)$, then $u$ may be charged by rule II. This also happens
at most once. It is obvious that in this case, no node can be charged to $u$ using rule I. For a node
$u \in E(v_j) \setminus B(v_j)$, it can be charged at most once using rule I. Moreover, by the charging process, all
nodes in $\bigcup_{v \in C^*} O(v)$ are charged to the nodes in some expanded disks that appear in $S'$. Therefore,
the total weight of $S'$ is at least $p$. We can see that each vertex in $V(S')$ is also in $C^*$ and appears
at most once. Therefore, $S'$ is independent in $\mathcal{M}_1$. Clearly, each $E(u)$ appears in $S'$ at most once.
Hence, $S'$ is also independent in $\mathcal{M}_2$, which proves our claim.

Since $S$ is an optimal intersection, we know the expanded disks in $S$ contain at least $p$ nodes.
By the requirement of $\mathcal{M}_1$, we can guarantee that the set of centers forms an independent set in
$\mathcal{M}$. For each $(v, E(u))$ in $S$, we can see that every node $v'$ in $E(u)$ is within a distance $7d(e_i)$
from $v$, as follows. Take some $u' \in B(v, d(e_i)) \cap B(u, 3d(e_i))$ (such a node exists because
$B(v, d(e_i)) \cap B(u, 3d(e_i)) \neq \emptyset$ for any pair $(v, E(u)) \in U$). By the triangle inequality,
$d(v', v) \le d(v', u) + d(u, u') + d(u', v) \le 3d(e_i) + 3d(e_i) + d(e_i) = 7d(e_i)$. This completes the proof of the theorem. □
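
For a single guessed radius, the steps of Algorithm 2 can be sketched as follows. This is an illustration under explicit assumptions: balls are taken in the metric given by a distance matrix, a pair $(v, E(u))$ is represented as the tuple `(v, u)`, the matroid $\mathcal{M}$ is a hypothetical oracle `indep`, and the weighted matroid intersection in step 10 is replaced by exhaustive search, which is exponential and only suitable for tiny instances.

```python
from itertools import combinations

def robust_mat_center_at_radius(dist, indep, p, r):
    """Run the steps of Algorithm 2 for one radius r; return a center set or None on failure."""
    n = len(dist)
    ball = lambda v, rad: {u for u in range(n) if dist[u][v] <= rad}
    uncovered, chosen, expanded = set(range(n)), [], {}
    while uncovered:
        # Greedy choice: the uncovered vertex whose disk covers the most uncovered vertices.
        v = max(sorted(uncovered), key=lambda x: len(ball(x, r) & uncovered))
        taken = set().union(*expanded.values())
        expanded[v] = ball(v, 3 * r) - taken      # expanded disks stay pairwise disjoint
        chosen.append(v)
        uncovered -= expanded[v]
    # Pairs (v, u) stand for (v, E(u)); the weight of a pair is |E(u)|.
    pairs = [(v, u) for v in range(n) for u in chosen if ball(v, r) & ball(u, 3 * r)]
    best, best_weight = None, -1
    for k in range(len(chosen) + 1):              # brute-force stand-in for step 10
        for combo in combinations(pairs, k):
            vs = [v for v, _ in combo]
            us = [u for _, u in combo]
            if len(set(vs)) == k and len(set(us)) == k and indep(set(vs)):
                w = sum(len(expanded[u]) for u in us)
                if w > best_weight:
                    best, best_weight = combo, w
    if best_weight < p:
        return None                               # failure: try the next radius
    return {v for v, _ in best}

# Seven points on a line; at most two centers; one point may be left as an outlier.
xs = [0, 1, 2, 10, 11, 12, 50]
dist = [[abs(a - b) for b in xs] for a in xs]
print(robust_mat_center_at_radius(dist, lambda s: len(s) <= 2, p=6, r=1))
```

When the routine succeeds for radius $r$, the argument in the proof of Theorem 3 says that at least $p$ vertices lie within distance $7r$ of the returned centers.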

3 The Knapsack Center Problem 

In this section, we study the KnapCenter problem and its outlier version. Recall that an input of
KnapCenter consists of a metric space $(V, d)$, $m$ nonnegative weight functions $w_1, \dots, w_m$ on $V$,
and $m$ budgets $B_1, \dots, B_m$. The goal is to select a set of centers $S \subseteq V$ with $w_i(S) \le B_i$ for
all $1 \le i \le m$, so as to minimize the maximum service cost of any vertex in $V$. In the outlier
version of KnapCenter, we are given an additional parameter $p \le |V|$, and the objective is to
minimize $\mathrm{cost}_p(S) := \min_{V' \subseteq V : |V'| \ge p} \max_{v \in V'} \min_{i \in S} d(v, i)$, i.e., the maximum service cost of any
non-outlier node after excluding at most $|V| - p$ nodes as outliers.
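To make the outlier objective concrete, the following minimal sketch evaluates $\mathrm{cost}_p(S)$ for a small instance. It relies on the observation that the best choice of $V'$ keeps the $p$ points with the smallest service cost, so $\mathrm{cost}_p(S)$ is the $p$-th smallest distance to the nearest center. The function name `outlier_cost` and the callable `dist` are illustrative assumptions, not part of the paper, and the sketch assumes $1 \le p \le |V|$ and a nonempty center set.

```python
def outlier_cost(points, centers, dist, p):
    """Return cost_p(S) for the center set `centers`.

    Assumes 1 <= p <= len(points) and that `centers` is nonempty.
    `dist(u, v)` is any metric supplied by the caller (hypothetical helper).
    """
    if not centers:
        raise ValueError("at least one center is required")
    if not 1 <= p <= len(points):
        raise ValueError("p must satisfy 1 <= p <= |V|")
    # Service cost of each point: distance to its closest center.
    service = sorted(min(dist(v, c) for c in centers) for v in points)
    # Keeping the p cheapest points minimizes the maximum service cost.
    return service[p - 1]
```

For example, on the line with points 0, 1, 2, 10 and a single center at 1, serving all four points costs 9, while allowing one outlier reduces the cost to 1.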

3.1 Approximability of KnapCenter 

When there is only one knapsack constraint (i.e., $m = 1$), the problem degenerates to the weighted
$k$-center problem for which a 3-approximation algorithm exists [18]. However, as we show in
Theorem 4, the situation changes dramatically even if there are only two knapsack constraints.

Theorem 4. For any $f > 0$, if there is an $f$-approximation algorithm for KnapCenter with two
knapsack constraints, then P = NP.

Proof. To prove the theorem, we present a reduction from the partition problem, which is well
known to be NP-hard [12], to the KnapCenter problem with two knapsack constraints. In the
partition problem, we are given a multiset of positive integers $S = \{s_1, s_2, \dots, s_n\}$, and the goal
is to decide whether $S$ can be partitioned into two subsets such that the sum of numbers in one
subset equals the sum of numbers in the other subset.






Given an instance $S = \{s_1, s_2, \dots, s_n\}$ of the partition problem, we construct an instance $\mathcal{I}$ of
the KnapCenter problem as follows. The set of clients is $V = \{a_i, b_i \mid 1 \le i \le n\}$. The distance
metric $d$ is defined as $d(a_i, b_i) = 0$ for all $1 \le i \le n$, and $d(a_i, a_j) = d(a_i, b_j) = d(b_i, b_j) = 1$ for all
$i \neq j$. It is easy to verify that $d$ is indeed a metric. Every client in $V$ has a unit demand. There are
two weight functions $w_1$ and $w_2$ specified as follows: for each $1 \le i \le n$, $w_1(a_i) = s_i$, $w_1(b_i) = 0$,
$w_2(a_i) = 0$, and $w_2(b_i) = s_i$. The two corresponding weight budgets are $B_1 = B_2 = T/2$, where
$T = \sum_{j=1}^{n} s_j$. This finishes the construction of $\mathcal{I}$.
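The construction can be made concrete with a short sketch. The code below builds the two-constraint instance from a list of integers and, purely for illustration, checks by brute force whether a cost-0 solution exists (which is of course exponential, in line with the hardness claim). The function names and the tuple encoding of the clients $a_i$, $b_i$ are assumptions made here, not notation from the paper.

```python
from fractions import Fraction
from itertools import product


def build_knapcenter_instance(s):
    """Build the instance for the partition multiset s (hypothetical encoding)."""
    n = len(s)
    # Client ("a", i) plays the role of a_i and ("b", i) the role of b_i.
    clients = [("a", i) for i in range(n)] + [("b", i) for i in range(n)]

    def dist(u, v):
        # a_i and b_i are at distance 0; all other distinct pairs are at distance 1.
        return 0 if u[1] == v[1] else 1

    w1 = {c: (s[c[1]] if c[0] == "a" else 0) for c in clients}
    w2 = {c: (s[c[1]] if c[0] == "b" else 0) for c in clients}
    budget = Fraction(sum(s), 2)  # B_1 = B_2 = T/2, kept exact
    return clients, dist, (w1, w2), (budget, budget)


def has_zero_cost_solution(s):
    """Brute-force check: is there a cost-0 center set within both budgets?"""
    _, _, (w1, w2), (b1, b2) = build_knapcenter_instance(s)
    # A cost-0 solution needs a_i or b_i for every i; taking both never helps.
    for sides in product("ab", repeat=len(s)):
        centers = [(side, i) for i, side in enumerate(sides)]
        if sum(w1[c] for c in centers) <= b1 and sum(w2[c] for c in centers) <= b2:
            return True
    return False
```

On the multiset {1, 2, 3} the check succeeds (take $a_1, a_2, b_3$), whereas on {1, 2, 4} it fails because the total is odd.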

We show that $S$ can be partitioned into two subsets of equal sum if and only if $\mathcal{I}$ has a
solution of cost 0. First consider the "if" direction. Assume that $\mathcal{I}$ admits a solution of cost 0.
Clearly, for each $1 \le i \le n$, the solution must take at least one of $\{a_i, b_i\}$ as a center, and we
assume w.l.o.g. that it takes exactly one of $a_i$ and $b_i$ (just choosing an arbitrary one if both are
taken). Let $I_1$ be the set of indices $i$ for which $a_i$ is taken as a center in the solution. Then
$I_2 = \{1, 2, \dots, n\} \setminus I_1$ consists of all indices $i$ for which $b_i$ is taken by the solution. Considering the
first weight constraint, we have $T/2 = B_1 \ge \sum_{i \in I_1} w_1(a_i) + \sum_{i \in I_2} w_1(b_i) = \sum_{i \in I_1} s_i$. Similarly, by
the second weight constraint, we get $T/2 \ge \sum_{i \in I_2} s_i$. Since $\sum_{i \in I_1} s_i + \sum_{i \in I_2} s_i = \sum_{i=1}^{n} s_i = T$, it
holds that $\sum_{i \in I_1} s_i = \sum_{i \in I_2} s_i = T/2$. Therefore, $S$ can be partitioned into two subsets of equal
sum.

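We next prove the "only if" part. Suppose there exists $I_1 \subseteq \{1, 2, \dots, n\}$ such that $\sum_{i \in I_1} s_i = T/2$.
In the instance $\mathcal{I}$, we take $\mathcal{T} := \{a_i \mid i \in I_1\} \cup \{b_j \mid j \in \{1, 2, \dots, n\} \setminus I_1\}$ as the set
of centers. This set contains one of $a_i, b_i$ for every $i$, so its cost is 0. It only remains to show that $\mathcal{T}$ satisfies both the weight constraints, which is easy to
verify: $\sum_{v \in \mathcal{T}} w_1(v) = \sum_{i \in I_1} s_i = T/2 \le B_1$, and $\sum_{v \in \mathcal{T}} w_2(v) = \sum_{j \in \{1, 2, \dots, n\} \setminus I_1} s_j = T - \sum_{j \in I_1} s_j = T/2 \le B_2$. This proves the "only if" direction.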

Since the optimal objective value of $\mathcal{I}$ is 0, any $f$-approximate solution is in fact an optimal one.
Hence, if KnapCenter with two constraints and unit demands allows an $f$-approximation algorithm
for any $f > 0$, then the partition problem can be solved in polynomial time, which implies P = NP.
The proof of Theorem 4 is thus complete. □

It is then natural to ask whether a constant factor approximation can be obtained if the
constraints can be relaxed slightly. We show in Theorem 5 that this is achievable (even for the demand
version). Before proving the theorem we first present some high-level ideas of our algorithm, shown
as Algorithm 3. The algorithm first guesses the optimal cost OPT, and then chooses a collection
of disjoint disks of radius OPT according to some rules. It can be shown that there exists a set
of centers consisting of exactly one point from each disk that gives a 3-approximate solution and
satisfies all the knapsack constraints. We then reduce the remaining task to another problem called
the group multi-knapsack problem, which will be formally defined in the following proof.

Theorem 5. For any fixed $\epsilon > 0$, there is a 3-approximation algorithm for KnapCenter with a
constant number of knapsack constraints, which is guaranteed to satisfy one constraint and violate
each of the others by at most a factor of $1 + \epsilon$.

In what follows we prove Theorem 5. We first present our algorithm for KnapCenter in
Algorithm 3. The algorithm works for the more general version where
each vertex $v$ has a demand $r(v)$ and the service cost of $v$ is $\min_{i \in S} r(v) d(v, i)$ when taking $S$ as
the set of centers.

Algorithm 3: Algorithm for KnapCenter with multiple constraints

1 Guess the optimal objective value OPT.
2 For each client $v \in V$, let $B(v) \leftarrow B(v, \mathrm{OPT})$ be the disk of $v$. Let $T \leftarrow \emptyset$.
3 while there exists $i \in V$ such that $B(i) \cap B(j) = \emptyset$ for all $j \in T$ do
4     Choose such an $i$ with maximum demand, and let $T \leftarrow T \cup \{i\}$.
5 end
6 Create an instance $\mathcal{I}$ of the group multi-knapsack problem as $\mathcal{I} = (\{B(i)\}_{i \in T}, \{w_j, B_j\}_{1 \le j \le m})$ (recall that $m = O(1)$), and get a solution $S$ by applying the algorithm indicated by Lemma 3.
7 return $S$

Given an instance of the KnapCenter problem, suppose Algorithm 3 correctly guesses the optimal
objective value OPT. (This can be equivalently realized by running the algorithm for all candidate
values of OPT, of which there are polynomially many since OPT equals $r(v) d(v, i)$ for some pair $v, i \in V$,
and taking the best solution among all the candidates.) The algorithm greedily finds
a collection of mutually disjoint disks $\{B(i)\}_{i \in T}$, and then constructs a set of centers by selecting
exactly one point from each disk using some algorithm for the group multi-knapsack problem, which
we will define later.
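The greedy disk selection (lines 2 to 5 of Algorithm 3) can be sketched as follows. This is an illustrative sketch under two assumptions that are not spelled out in the pseudocode: the disk of $v$ is taken to be $\{u : r(v) d(v, u) \le \mathrm{OPT}\}$, which matches how disks are used in the proof for the demand version, and ties in demand are broken by input order. Scanning the points once in nonincreasing order of demand is equivalent to the while loop, because a point whose disk already intersects a chosen disk can never become eligible later. The names `greedy_disjoint_disks`, `demand` and `opt` are hypothetical.

```python
def greedy_disjoint_disks(points, dist, demand, opt):
    """Return (T, disks): greedily chosen points with pairwise disjoint disks.

    Assumption: the disk of v is {u : demand[v] * dist(v, u) <= opt}.
    """
    disks = {
        v: frozenset(u for u in points if demand[v] * dist(v, u) <= opt)
        for v in points
    }
    chosen = []
    # Highest demand first; sorted() is stable, so ties keep input order.
    for v in sorted(points, key=lambda x: -demand[x]):
        if all(disks[v].isdisjoint(disks[j]) for j in chosen):
            chosen.append(v)
    return chosen, disks
```

The returned disks are exactly the groups that line 6 of Algorithm 3 hands to the group multi-knapsack routine.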

Call a set $S$ standard if $S$ consists of exactly one point from each of the disks $\{B(i)\}_{i \in T}$.
We first show that there exists a standard set $S$ such that $w_j(S) \le B_j$ for all $1 \le j \le m$, i.e., $S$
fulfills all the knapsack constraints. Suppose $O \subseteq V$ is the set of centers opened in some optimal
solution. Then, for each $i \in T$, there exists $j \in O$ such that $r(i) d(i, j) \le \mathrm{OPT}$, and thus $j \in B(i)$.
Hence, we can choose from each $B(i)$ exactly one point that belongs to $O$, and these points are
distinct because the disks are pairwise disjoint. Let $S$ denote the set of these points. Clearly, $S$ is
standard and is a subset of $O$, and thus $w_j(S) \le w_j(O) \le B_j$ for all $1 \le j \le m$. This proves the
existence of a standard set that satisfies all the knapsack constraints.

We will reduce the remaining task to another problem called the group multi-knapsack problem,
which we define as follows. Suppose we are given a collection of pairwise disjoint sets $\{S_i\}_{1 \le i \le n}$.
Let $\mathcal{S} = \bigcup_{i=1}^{n} S_i$. For some fixed integer $m \ge 1$, there are $m$ nonnegative weight functions defined
on the items of $\mathcal{S}$, which we denote by $w_1, \dots, w_m$, and $m$ weight limits $B_1, \dots, B_m$. A solution is
a subset $S' \subseteq \mathcal{S}$ that consists of exactly one element from each of the $n$ sets $S_1, \dots, S_n$. The goal
is to find a solution $S'$ such that $w_j(S') \le B_j$ for all $1 \le j \le m$, provided that such a solution exists.
For our purpose, we require the number of constraints to be a constant. This problem is new to
our knowledge, and may be useful in other applications. By Lemma 3 (which will be presented and
proved later), we can find in polynomial time a solution that satisfies one constraint and violates
each of the others by a small factor.
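To pin down the problem definition, here is a brute-force reference checker. It is only a sketch for tiny instances (it enumerates all $\prod_i |S_i|$ selections) and is not the polynomial-time algorithm of Lemma 3. The function name and the representation of weight functions as dictionaries are assumptions made for this illustration.

```python
from itertools import product


def brute_force_group_multi_knapsack(groups, weights, budgets):
    """Return one selection meeting every budget, or None if none exists.

    groups  : list of lists of items (pairwise disjoint sets S_1..S_n)
    weights : list of m dicts mapping item -> nonnegative weight
    budgets : list of m weight limits B_1..B_m
    """
    for choice in product(*groups):  # exactly one item from each group
        if all(sum(w[v] for v in choice) <= b for w, b in zip(weights, budgets)):
            return list(choice)
    return None
```

With groups `[["a", "b"], ["c", "d"]]`, weights $w_1 = (3, 1, 1, 3)$ and $w_2 = (1, 3, 3, 1)$ on $(a, b, c, d)$, budgets $(4, 4)$ admit the selection $\{a, c\}$, while budgets $(3, 3)$ admit none.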

Now come back to the KnapCenter problem. By Lemma 3, line 6 of Algorithm 3 produces in
polynomial time a standard set $S$ that satisfies one constraint and violates each of the others by
a factor of at most $1 + \epsilon$. (We notice that, when running Algorithm 3 with an incorrect value of
OPT, there may not exist any standard set, in which case the algorithm may return an empty set.
We shall simply ignore such solutions.)

It now only remains to show that, by designating $S$ as the set of centers, the maximum service
cost of any client is at most $3 \cdot \mathrm{OPT}$. Suppose $S \cap B(i) = \{t_i\}$ for each $i \in T$. It suffices to prove
that, for each $j \in V$, there exists $i \in T$ such that $r(j) d(j, t_i) \le 3 \cdot \mathrm{OPT}$. We consider two cases.

1. $j \in T$. Since $t_j \in B(j)$, we have $r(j) d(j, t_j) \le \mathrm{OPT} \le 3 \cdot \mathrm{OPT}$ by the definition of $B(j)$.

2. $j \notin T$. Then $B(j) \cap B(i) \neq \emptyset$ for some $i \in T$, since otherwise $j$ would have been added to $T$ by the
algorithm. Let $Q = \{i \in T \mid B(i) \cap B(j) \neq \emptyset\}$. If $r(i) < r(j)$ for all $i \in Q$, then the algorithm
would have chosen $j$ before choosing any $i \in Q$, which contradicts the assumption that $j \notin T$.
Thus, there exists $i \in Q$ for which $r(i) \ge r(j)$. Consider this particular $i$, and choose an
arbitrary $i' \in B(i) \cap B(j)$. We have

$$
\begin{aligned}
r(j) d(j, t_i) &\le r(j)\bigl(d(j, i') + d(i, i') + d(i, t_i)\bigr) && \text{(triangle inequality)} \\
&\le r(j) d(j, i') + r(i) d(i, i') + r(i) d(i, t_i) && \text{(because } r(i) \ge r(j)\text{)} \\
&\le \mathrm{OPT} + \mathrm{OPT} + \mathrm{OPT} && \text{(due to the definition of disks)} \\
&= 3 \cdot \mathrm{OPT}.
\end{aligned}
$$

Combining the two cases, we have shown that the service cost with centers in $S$ is at most three
times the optimal cost, which completes the proof.

Finally, we need the following Lemma 3, which is used in the above argument. The group
multi-knapsack problem is similar to the multiple knapsack problem (i.e., the knapsack problem
with multiple resource constraints), and the (standard) technique for the latter can be easily adapted
to solve the group multi-knapsack problem (see, e.g., [28, 19]). Another way to deduce Lemma 3 is
by applying the $\epsilon$-approximate Pareto curve method introduced by Papadimitriou and Yannakakis
[27]. For the sake of completeness, we give a proof of Lemma 3 in Appendix A.

Lemma 3. For any fixed $\epsilon > 0$, there is a polynomial time algorithm that, given an instance
of group multi-knapsack for which a solution satisfying all weight constraints exists, constructs
a solution that satisfies one constraint and violates each of the others by at most
a factor of $1 + \epsilon$.

3.2 Dealing with Outliers: Robust-KnapCenter

We now study Robust-KnapCenter, the outlier version of KnapCenter. Here we consider the case
with one knapsack constraint (with weight function $w$ and budget $B$) and unit demand. Our main
theorem is as follows.

Theorem 6. There is a 3-approximation algorithm for Robust-KnapCenter that violates the
knapsack constraint by at most a factor of $1 + \epsilon$ for any fixed $\epsilon > 0$.

Algorithm 4: Algorithm for Robust-KnapCenter

1 Guess the optimal objective value OPT.
2 For each $v \in V$, let $B(v) \leftarrow B(v, \mathrm{OPT})$ and $E(v) \leftarrow B(v, 3\,\mathrm{OPT})$.
3 $S \leftarrow \emptyset$; $C \leftarrow \emptyset$ (the points in $C$ are covered and those in $V \setminus C$ are uncovered).
4 while $w(S) < B$ and $V \setminus C \neq \emptyset$ do
5     Choose $i \in V \setminus S$ that maximizes $\frac{|B(i) \setminus C|}{w(i)}$.
6     $S \leftarrow S \cup \{i\}$; $C \leftarrow C \cup E(i)$ (i.e., mark all uncovered points in $E(i)$ as covered).
7 end
8 return $S$

We present our algorithm for Robust-KnapCenter as Algorithm 4. We assume that $B < w(V)$,
since otherwise the problem is trivial. We also set $A/0 := \infty$ for $A > 0$ and $0/0 := 0$, which
makes line 5 work even if $w(i) = 0$. Our algorithm can be regarded as a "weighted" version of
that of Charikar et al. [2], but the analysis is much more involved. We next prove the following
theorem, which can be used together with the partial enumeration technique to yield Theorem 6.
Note that, if all clients have unit weight, Theorem 7 will guarantee a 3-approximate solution $S$ with
$w(S) < B + 1$, which implies $w(S) \le B$. So it actually gives a 3-approximation without violating
the constraint. Thus, our result generalizes that of Charikar et al. [2].
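A minimal sketch of the greedy loop of Algorithm 4, for a fixed guess of OPT, is given below. It assumes that line 5 scores a candidate $i$ by the number of still-uncovered points in its disk $B(i)$ divided by $w(i)$, which is the quantity compared in the analysis, and it implements the conventions $A/0 = \infty$ for $A > 0$ and $0/0 = 0$ explicitly. The function and parameter names are hypothetical, and ties are broken by input order.

```python
import math


def robust_knapcenter_greedy(points, dist, weight, budget, opt):
    """Greedy center selection for one guess `opt` (unit demands)."""
    disk = {v: {u for u in points if dist(v, u) <= opt} for v in points}
    expanded = {v: {u for u in points if dist(v, u) <= 3 * opt} for v in points}
    centers, covered, total = [], set(), 0

    def ratio(i):
        gain = len(disk[i] - covered)  # uncovered points inside B(i)
        if weight[i] == 0:
            return math.inf if gain > 0 else 0.0  # A/0 = inf, 0/0 = 0
        return gain / weight[i]

    while total < budget and len(covered) < len(points):
        # An uncovered point is never a center, so candidates is nonempty here.
        candidates = [v for v in points if v not in centers]
        i = max(candidates, key=ratio)
        centers.append(i)
        total += weight[i]
        covered |= expanded[i]  # mark the expanded disk E(i) as covered
    return centers, covered
```

On the line with points 0, 1, 2, 10, 11, 20, unit weights, budget 2 and guess `opt = 1`, the loop picks 1 and then 10, covering five of the six points.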

Theorem 7. Given an input of the Robust-KnapCenter problem, Algorithm 4 returns a set $S$ with
$w(S) < B + \max_{v \in V} w(v)$ such that $\mathrm{cost}_p(S) \le 3\,\mathrm{OPT}$.

Proof. We call $B(v)$ the disk of $v$ and $E(v)$ the expanded disk of $v$. Assume w.l.o.g. that the
algorithm returns $S = \{1, 2, \dots, q\}$ where $q = |S|$, and that the centers are chosen in the order
$1, 2, \dots, q$. We first observe that $B(1), \dots, B(q)$ are pairwise disjoint, which can be seen as follows.
By standard use of the triangle inequality, we have $B(i) \subseteq E(j)$ and $B(j) \subseteq E(i)$ for any $i, j \in V$
such that $B(i) \cap B(j) \neq \emptyset$. Therefore, if there exist $1 \le i < j \le q$ such that $B(j) \cap B(i) \neq \emptyset$, then
all points in $B(j)$ are marked "covered" when choosing $i$, and hence choosing $j$ cannot cover any
more points, contradicting the way in which the centers are chosen (note that the algorithm
terminates when all points have been covered). So the $q$ disks $B(1), \dots, B(q)$ are pairwise disjoint.

For ease of notation, let $B(V') := \bigcup_{v \in V'} B(v)$ and $E(V') := \bigcup_{v \in V'} E(v)$ for $V' \subseteq V$. By
the condition of the WHILE loop, $w(\{1, \dots, q - 1\}) < B$, and thus $w(S) < B + w(q) \le B +
\max_{v \in V} w(v)$. It remains to prove $\mathrm{cost}_p(S) \le 3\,\mathrm{OPT}$. Note that this clearly holds if the expanded
disks $E(1), \dots, E(q)$ together cover at least $p$ points. Thus, it suffices to show that $|E(S)| \ge p$. If
$w(S) < B$, then all points in $V$ are covered by $E(S)$ due to the termination condition of the WHILE
loop, and thus $|E(S)| = |V| \ge p$. In the rest of the proof, we deal with the case $w(S) \ge B$.

For each $v \in V$, let $f(v)$ be the minimum $i \in S$ such that $B(v) \cap B(i) \neq \emptyset$; let $f(v) = \infty$ if no
such $i$ exists (i.e., if disk $B(v)$ is disjoint from all disks centered in $S$). Suppose $O = \{o_1, o_2, \dots, o_m\}$
is an optimal solution (here $m = |O|$), in which the centers are ordered such that $f(o_1) \le \dots \le f(o_m)$. Since the
optimal solution is also feasible, we have $|B(O)| \ge p$. Hence, to prove $|E(S)| \ge p$, we only need to
show $|E(S)| \ge |B(O)|$. For any sets $A$ and $B$, we have $|A| = |A \setminus B| + |A \cap B|$. Therefore,

$$
\begin{aligned}
|E(S)| - |B(O)| &= \bigl(|E(S) \setminus B(O)| + |E(S) \cap B(O)|\bigr) - \bigl(|B(O) \setminus E(S)| + |E(S) \cap B(O)|\bigr) \\
&= |E(S) \setminus B(O)| - |B(O) \setminus E(S)| \\
&\ge |B(S) \setminus B(O)| - |B(O) \setminus E(S)| \quad \text{(because } B(S) \subseteq E(S)\text{)}.
\end{aligned}
\tag{1}
$$

As $B(1), \dots, B(q)$ are pairwise disjoint,

$$|B(S) \setminus B(O)| = \Bigl|\bigcup_{i \in S} \bigl(B(i) \setminus B(O)\bigr)\Bigr| = \sum_{i \in S} |B(i) \setminus B(O)|$$

and

$$|B(O) \setminus E(S)| = \Bigl|\bigcup_{j=1}^{m} \bigl(B(o_j) \setminus E(S)\bigr)\Bigr| \le \sum_{j=1}^{m} |B(o_j) \setminus E(S)|.$$

Thus,

$$|E(S)| - |B(O)| \ge \sum_{i \in S} |B(i) \setminus B(O)| - \sum_{j=1}^{m} |B(o_j) \setminus E(S)|. \tag{2}$$

Figure 2: An example of the algorithm for Robust-KnapCenter. The algorithm returns $S = \{1, 2, 3\}$, and
the optimal solution opens $\{o_1, o_2, \dots, o_6\}$. Disks and expanded disks are represented by (small) circles
and (large) dashed circles, respectively. In this case, we have $f(o_1) = f(o_2) = 1$, $f(o_3) = f(o_4) = 2$,
$f(o_5) = f(o_6) = \infty$, and thus $t = 5$. Then, $R(1) = \{1, 2\}$, $R(2) = \{3, 4\}$, and $R(3) = \emptyset$.

Let $t$ be the unique integer in $\{1, \dots, m + 1\}$ such that $f(o_j) \le |S|$ for all $1 \le j \le t - 1$ and
$f(o_j) = \infty$ for all $t \le j \le m$. (That is, each disk $B(o_j)$ ($1 \le j \le t - 1$) intersects $B(i)$ for some
$i \in S$, while the remaining $B(o_t), \dots, B(o_m)$ are disjoint from all the disks of points in $S$. Such $t$
exists because $f(o_1) \le \dots \le f(o_m)$. See Figure 2 for an example.) Then, for all $1 \le j \le t - 1$, we
have $B(o_j) \cap B(f(o_j)) \neq \emptyset$, and thus $B(o_j) \subseteq E(f(o_j)) \subseteq E(S)$, implying that $|B(o_j) \setminus E(S)| = 0$ for
all $1 \le j \le t - 1$. Combining with the inequality (2), we have

$$|E(S)| - |B(O)| \ge \sum_{i \in S} |B(i) \setminus B(O)| - \sum_{j=t}^{m} |B(o_j) \setminus E(S)|. \tag{3}$$

Hence, it suffices to prove that

$$\sum_{i \in S} |B(i) \setminus B(O)| - \sum_{j=t}^{m} |B(o_j) \setminus E(S)| \ge 0. \tag{4}$$

The inequality is trivial when $t = m + 1$. Thus, we assume in what follows that $t \le m$, i.e.,
$B(o_m)$ is disjoint from $B(1), B(2), \dots, B(q)$. Before proving (4) we introduce some notation. For
each $i \in S$, define $R(i) := \{j \mid 1 \le j \le m;\ f(o_j) = i\}$, and let $l(i) := \min\{j \mid j \in R(i)\}$ and
$q(i) := \max\{j \mid j \in R(i)\}$ be the minimum index and maximum index in $R(i)$, respectively (let
$l(i) = q(i) = \infty$ if $R(i) = \emptyset$). By the definitions of $f(\cdot)$ and $t$, each $R(i)$ is a set of consecutive
integers (or empty), and $\{R(i)\}_{i \in S}$ forms a partition of $\{1, 2, \dots, t - 1\}$. Also, $q(i) = l(i + 1) - 1$ if
$l(i + 1) \neq \infty$. See Figure 2 for an illustration of the notation.

Consider an arbitrary $i \in S$. For each $j$ such that $l(i + 1) \le j \le t - 1$, we know that $j \in R(i')$
for some $i' > i$, i.e., $f(o_j) = i' > i$, and thus $B(o_j) \cap B(i) = \emptyset$. By the definition of $t$, we also have
$B(o_j) \cap B(i) = \emptyset$ for all $t \le j \le m$. Therefore,

$$B(o_j) \cap B(i) = \emptyset \quad \text{for all } j \text{ s.t. } \min\{t, l(i + 1)\} \le j \le m. \tag{5}$$

(Here we take the minimum of $l(i + 1)$ and $t$ because $l(i + 1)$ may be $\infty$.)

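We next try to lower-bound $|B(i) \setminus B(O)|$ in order to establish (4). Equality (5) tells us that
$B(o_j) \cap B(i) \neq \emptyset$ implies $j \in R(1) \cup \dots \cup R(i)$. In consequence,

$$B(i) \setminus B(O) = B(i) \setminus \bigcup_{j=1}^{m} B(o_j) = B(i) \setminus \bigcup_{j \in R(1) \cup \dots \cup R(i)} B(o_j). \tag{6}$$

For each $j \in R(i')$ with $1 \le i' \le i - 1$, $B(o_j) \cap B(i') \neq \emptyset$, and thus $B(o_j) \subseteq E(i') \subseteq E(\{1, 2, \dots, i - 1\})$.
For convenience, define $E_{<i} := E(\{1, 2, \dots, i - 1\})$. Then, from (6) we get $B(i) \setminus B(O) \supseteq B(i) \setminus
\bigl(E_{<i} \cup \bigcup_{j \in R(i)} B(o_j)\bigr)$, and hence

$$
\begin{aligned}
|B(i) \setminus B(O)| &\ge \Bigl|B(i) \setminus \Bigl(E_{<i} \cup \bigcup_{j \in R(i)} B(o_j)\Bigr)\Bigr| \\
&= \Bigl|B(i) \setminus \Bigl(E_{<i} \cup \bigcup_{j \in R(i)} \bigl(B(o_j) \setminus E_{<i}\bigr)\Bigr)\Bigr| \quad \Bigl(\text{because } A \cup \bigcup_i B_i = A \cup \bigcup_i (B_i \setminus A)\Bigr) \\
&= \Bigl|\bigl(B(i) \setminus E_{<i}\bigr) \setminus \bigcup_{j \in R(i)} \bigl(B(o_j) \setminus E_{<i}\bigr)\Bigr| \ge |B(i) \setminus E_{<i}| - \sum_{j \in R(i)} |B(o_j) \setminus E_{<i}|.
\end{aligned}
\tag{7}
$$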

Now consider the particular execution of line 5 in which $i$ is chosen and added to $S$. Note that
(5) holds for all $i \in S$. Thus, for all $1 \le i' \le i - 1$ and $\min\{t, l(i' + 1)\} \le j \le m$, $B(o_j)$ is disjoint
from $B(i')$, which in particular implies $o_j \notin B(i')$. By considering all $i' \in \{1, \dots, i - 1\}$ and noting
that $l(i) \ge l(i' + 1)$, we have $o_j \notin B(\{1, 2, \dots, i - 1\})$ for all $\min\{t, l(i)\} \le j \le m$. This further
indicates that $\{1, 2, \dots, i - 1\} \cap \{o_j \mid \min\{t, l(i)\} \le j \le m\} = \emptyset$. Recall that $1, 2, \dots, i - 1$ are all
the points added to $S$ before $i$. Therefore, no point in $\{o_j \mid \min\{t, l(i)\} \le j \le m\}$ was chosen
before $i$. By our way of choosing centers (see line 5), we have

$$\frac{|B(i) \setminus E_{<i}|}{w(i)} \ge \frac{|B(o_j) \setminus E_{<i}|}{w(o_j)} \quad \text{for all } \min\{t, l(i)\} \le j \le m. \tag{8}$$

Hence, for all $j \in R(i)$,

$$|B(o_j) \setminus E_{<i}| \le \frac{w(o_j)}{w(i)} |B(i) \setminus E_{<i}|.$$

Substituting the above inequality into (7) gives

$$|B(i) \setminus B(O)| \ge |B(i) \setminus E_{<i}| - \sum_{j \in R(i)} \frac{w(o_j)}{w(i)} |B(i) \setminus E_{<i}| = \left(1 - \frac{\sum_{j \in R(i)} w(o_j)}{w(i)}\right) |B(i) \setminus E_{<i}|. \tag{9}$$



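By (8) we also have

$$|B(i) \setminus E_{<i}| \ge w(i) \cdot \max_{t \le j \le m} \frac{|B(o_j) \setminus E_{<i}|}{w(o_j)} \ge w(i) \cdot \frac{\sum_{j=t}^{m} |B(o_j) \setminus E_{<i}|}{\sum_{j=t}^{m} w(o_j)},$$

where we use the inequality $\max_j (A_j / B_j) \ge \sum_j A_j / \sum_j B_j$ when $B_j > 0$ for all $j$. Plugging this inequality into
(9) and noting that $E_{<i} \subseteq E(S)$, we obtain:

$$
\begin{aligned}
|B(i) \setminus B(O)| &\ge \frac{w(i) - \sum_{j \in R(i)} w(o_j)}{w(i)} \, |B(i) \setminus E_{<i}| \\
&\ge \frac{w(i) - \sum_{j \in R(i)} w(o_j)}{\sum_{j=t}^{m} w(o_j)} \sum_{j=t}^{m} |B(o_j) \setminus E(S)|.
\end{aligned}
\tag{10}
$$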






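Applying (10) for all $i \in S$ and summing the resulting inequalities up, we get

$$
\begin{aligned}
\sum_{i \in S} |B(i) \setminus B(O)| &\ge \frac{\sum_{i \in S} \bigl(w(i) - \sum_{j \in R(i)} w(o_j)\bigr)}{\sum_{j=t}^{m} w(o_j)} \sum_{j=t}^{m} |B(o_j) \setminus E(S)| \\
&= \frac{w(S) - \sum_{j=1}^{t-1} w(o_j)}{\sum_{j=t}^{m} w(o_j)} \sum_{j=t}^{m} |B(o_j) \setminus E(S)|,
\end{aligned}
\tag{11}
$$

where the last equality holds because $\{R(i)\}_{i \in S}$ is a partition of $\{1, 2, \dots, t - 1\}$.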

Recall that we are dealing with the case of $w(S) \ge B$. Since $O$ is an optimal solution meeting
the weight constraint, $w(O) = \sum_{j=1}^{m} w(o_j) \le B \le w(S)$. Therefore, by (11) we have

$$\sum_{i \in S} |B(i) \setminus B(O)| \ge \frac{\sum_{j=1}^{m} w(o_j) - \sum_{j=1}^{t-1} w(o_j)}{\sum_{j=t}^{m} w(o_j)} \sum_{j=t}^{m} |B(o_j) \setminus E(S)| = \sum_{j=t}^{m} |B(o_j) \setminus E(S)|,$$

which immediately gives (4). This completes the proof of Theorem 7. □

At the end of this section, we prove Theorem 6 using Theorem 7 and the partial enumeration
technique. Fix a parameter $\epsilon > 0$. Given an instance $\mathcal{I}$ of Robust-KnapCenter, call a point $v \in V$
heavy if $w(v) > \epsilon \cdot B$. Let $O \subseteq V$ be the set of centers taken by the optimal solution of $\mathcal{I}$ (without
violating the knapsack constraint), and $\mathcal{H}$ be the set of heavy centers in $O$. Let OPT denote
the optimum objective value. Clearly, $|\mathcal{H}| < B / (\epsilon \cdot B) = 1/\epsilon$. We guess the elements of $\mathcal{H}$ by
trying all possible cases (at most $|V|^{O(1/\epsilon)}$ possibilities) and using the best solution. We
then construct a new instance $\mathcal{I}'$ of Robust-KnapCenter as follows: the metric space is the same
as that of $\mathcal{I}$, the weight function $w'$ is defined as $w'(v) = 0$ for $v \in \mathcal{H}$ and $w'(v) = w(v)$ for
$v \in V \setminus \mathcal{H}$, and the weight budget is $B' = B - w(\mathcal{H})$. It is easy to see that opening $O$ in $\mathcal{I}'$ gives
a feasible solution of cost OPT. Note that the maximum weight of any point in $\mathcal{I}'$ is at most $\epsilon \cdot B$.
Hence, by Theorem 7 we can find in polynomial time a solution $S$ such that $\mathrm{cost}_p(S) \le 3\,\mathrm{OPT}$ and
$w'(S) \le B - w(\mathcal{H}) + \epsilon \cdot B$. We use $S$ as our solution to the original instance $\mathcal{I}$. Then, $\mathrm{cost}_p(S) \le 3\,\mathrm{OPT}$
and $w(S) \le w'(S) + w(\mathcal{H}) \le (1 + \epsilon) B$. The proof is complete.

4 Concluding Remarks and Open Problems

We gave a 3-approximation algorithm for MatCenter, and the best known inapproximability bound
is $2 - \epsilon$. For Robust-MatCenter, we gave a 7-approximation while the current best known lower bound
is $3 - \epsilon$ due to the hardness of robust $k$-center with forbidden centers [2]. It would be interesting
to close these gaps. (Note that MatCenter includes as a special case the $k$-center problem with
forbidden centers, i.e., some points are not allowed to be chosen as centers. It is known that
another generalization of the latter, namely the $k$-supplier problem, is NP-hard to approximate
within $3 - \epsilon$ [18].) For Robust-KnapCenter, it is interesting to explore whether a constant factor
approximation exists that does not violate the knapsack constraint. It is also open whether there is
a constant factor approximation for the demand version (even for the unit-weight case). Finally,
extending our results for Robust-KnapCenter to the multi-constraint case seems intriguing and may
require essentially different ideas.

A Proof of Lemma 3

Let $\mathcal{I} = (\{S_i\}_{1 \le i \le n}, \{w_j, B_j\}_{1 \le j \le m})$ be an instance of the group multi-knapsack problem, for which
there exists a solution satisfying all the weight constraints; we will call such a solution good. Let
$\mathcal{S} = \bigcup_{i=1}^{n} S_i$ and $w_{\max} = \max_{v \in \mathcal{S};\, 1 \le j \le m} w_j(v)$. When $m = 1$, we can simply choose from each $S_i$
the element $v \in S_i$ with the smallest $w_1(v)$. In what follows, we assume $m \ge 2$. If there exists
$1 \le j \le m$ such that $w_{\max} > B_j$, then the element having the weight $w_{\max}$ cannot appear in any
good solution, and we will modify the instance by removing it from $\mathcal{S}$. Hence, we also assume that
$w_{\max} \le B_j$ for all $1 \le j \le m$.

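We apply the scaling technique that has been widely used in the design of PTASs for
knapsack-like problems. Fix $\epsilon > 0$, and define $A := \epsilon \cdot w_{\max} / n$. For each $v \in \mathcal{S}$, define

$$w'_j(v) = \lfloor w_j(v) / A \rfloor \text{ for all } 1 \le j \le m - 1, \text{ and } w'_m(v) = w_m(v).$$

Also define

$$B'_j = \min\{\lfloor B_j / A \rfloor, \lfloor n^2 / \epsilon \rfloor\} \text{ for all } 1 \le j \le m - 1, \text{ and } B'_m = B_m.$$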

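(The choice of the "special" index $m$ can be arbitrary; it indicates the constraint that we wish to
satisfy.) We have $w'_j(v) \in \{0, 1, \dots, K\}$ for all $1 \le j \le m - 1$ and $v \in \mathcal{S}$, where $K = \lfloor w_{\max} / A \rfloor =
\lfloor n / \epsilon \rfloor$. Create a new instance $\mathcal{I}' = (\{S_i\}_{1 \le i \le n}, \{w'_j, B'_j\}_{1 \le j \le m})$. For the original instance $\mathcal{I}$, we
know that there exists a good solution $T \subseteq \mathcal{S}$. Using the inequality $\lfloor a \rfloor + \lfloor b \rfloor \le \lfloor a + b \rfloor$, we obtain
that for each $1 \le j \le m - 1$,

$$
\begin{aligned}
w'_j(T) = \sum_{v \in T} \lfloor w_j(v) / A \rfloor &\le \min\Bigl\{\Bigl\lfloor \sum_{v \in T} w_j(v) / A \Bigr\rfloor,\ n \cdot \lfloor n / \epsilon \rfloor\Bigr\} \\
&\le \min\{\lfloor B_j / A \rfloor, \lfloor n^2 / \epsilon \rfloor\} \\
&= B'_j.
\end{aligned}
$$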

Also, $w'_m(T) = w_m(T) \le B_m = B'_m$. Therefore, $T$ is also a good solution of $\mathcal{I}'$. For $i \in \{1, 2, \dots, n\}$,
a subset $T \subseteq \mathcal{S}$ is called $i$-standard if $T$ consists of exactly one element from each of the $i$ sets
$S_1, S_2, \dots, S_i$. Thus a solution of $\mathcal{I}'$ is just an $n$-standard subset, and vice versa. For each
tuple $(i, p_1, p_2, \dots, p_{m-1})$ where $i \in \{1, \dots, n\}$ and $(\forall 1 \le j \le m - 1)\ p_j \in \{0, 1, \dots, B'_j\}$, let
$F(i, p_1, p_2, \dots, p_{m-1})$ denote the minimum possible value of $p_m$ for which there exists an $i$-standard
subset $T$ such that $w'_j(T) \le p_j$ for all $1 \le j \le m$, and let $T(i, p_1, p_2, \dots, p_{m-1})$ be an (arbitrary)
such $i$-standard subset. If such $p_m$ does not exist, then we let $F(i, p_1, p_2, \dots, p_{m-1}) = \infty$ and
$T(i, p_1, p_2, \dots, p_{m-1}) = \emptyset$. Since $\mathcal{I}'$ admits a good solution, it is easy to see that

$$F(n, B'_1, B'_2, \dots, B'_{m-1}) \le B'_m.$$

Our goal is thus to find $T(n, B'_1, \dots, B'_{m-1})$. Note that the number of tuples $(i, p_1, \dots, p_{m-1})$ is at
most $n \cdot \prod_{j=1}^{m-1} (B'_j + 1) = O\bigl(n (n^2 / \epsilon)^{m-1}\bigr) = n^{O(1)}$, since $m$ and $\epsilon$ are both constants.
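The rounding that defines $w'_j$ and $B'_j$ can be sketched as follows. This is an illustrative sketch, assuming $w_{\max} > 0$ and exact rational arithmetic (passing $\epsilon$ as a `Fraction`) so that the floors are not disturbed by floating-point error; the function name `scale_instance` and the dictionary representation of the weight functions are assumptions made here. The last weight function and budget are left unscaled, since they play the role of the "special" index $m$.

```python
import math
from fractions import Fraction


def scale_instance(weights, budgets, n, eps):
    """Return (A, w', B') for the scaled instance.

    weights : list of m dicts item -> nonnegative integer weight
    budgets : list of m budgets
    n       : number of groups
    eps     : accuracy parameter, ideally a Fraction (assumes w_max > 0)
    """
    w_max = max(max(w.values()) for w in weights)
    A = Fraction(eps) * w_max / n  # scaling unit A = eps * w_max / n
    # Round the first m-1 weight functions down to multiples of A.
    scaled_w = [{v: math.floor(x / A) for v, x in w.items()} for w in weights[:-1]]
    scaled_w.append(dict(weights[-1]))  # w'_m = w_m
    cap = math.floor(n * n / Fraction(eps))
    scaled_b = [min(math.floor(b / A), cap) for b in budgets[:-1]]
    scaled_b.append(budgets[-1])  # B'_m = B_m
    return A, scaled_w, scaled_b
```

For instance, with $n = 2$, $\epsilon = 1/2$ and $w_{\max} = 3$, the unit is $A = 3/4$, a weight of 3 is rounded to 4 units, and a budget of 4 becomes $\min\{5, 8\} = 5$ units.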

We now compute all $F(i, p_1, p_2, \dots, p_{m-1})$ and find the corresponding $i$-standard subsets by
dynamic programming. The base case is $i = 1$. For each tuple $(1, p_1, p_2, \dots, p_{m-1})$, let $\mathcal{R} = \{v \in
S_1 \mid (\forall 1 \le j \le m - 1)\ w'_j(v) \le p_j\}$. If $\mathcal{R} \neq \emptyset$, then clearly $F(1, p_1, \dots, p_{m-1}) = \min_{v \in \mathcal{R}} w'_m(v)$, and
we set $T(1, p_1, \dots, p_{m-1})$ to be the element $v \in \mathcal{R}$ that achieves the minimum $w'_m(v)$. If $\mathcal{R} = \emptyset$, then
$F(1, p_1, \dots, p_{m-1}) = \infty$ and $T(1, p_1, \dots, p_{m-1}) = \emptyset$.

Next we derive the transition function for computing $F(i, p_1, p_2, \dots, p_{m-1})$ for $i \ge 2$. We
enumerate all possible $v \in S_i$ that may belong to $T(i, p_1, \dots, p_{m-1})$. Then, it is easy to see that

$$F(i, p_1, \dots, p_{m-1}) = \min_{v \in S_i} \bigl\{ w'_m(v) + F(i - 1, p_1 - w'_1(v), p_2 - w'_2(v), \dots, p_{m-1} - w'_{m-1}(v)) \bigr\}.$$

(We assume $F(i', p'_1, \dots, p'_{m-1}) = \infty$ if $p'_j < 0$ for some $j$.)
If $F(i, p_1, \dots, p_{m-1}) = \infty$, then we let $T(i, p_1, \dots, p_{m-1}) = \emptyset$; otherwise, assuming the minimum
value is attained at $v \in S_i$, we set

$$T(i, p_1, \dots, p_{m-1}) = \{v\} \cup T(i - 1, p_1 - w'_1(v), \dots, p_{m-1} - w'_{m-1}(v)).$$

In this way, we can correctly compute the values of every $F(i, p_1, \dots, p_{m-1})$ and find the set
$T(i, p_1, \dots, p_{m-1})$ witnessing the value. Since there are only $n^{O(1)}$ tuples and the time spent on
each tuple is polynomial in the number of elements, the computation can be done in polynomial
time.
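The recurrence can be sketched as a memoized recursion. The sketch below takes already-scaled integer weights for the first $m - 1$ constraints and minimizes the last weight subject to the given capacities. It deviates from the text in one harmless way: it uses $i = 0$ (no groups, value 0) as the base case instead of $i = 1$. The function name and argument layout are hypothetical, and for large $n$ an iterative table would be preferable to recursion.

```python
import math
from functools import lru_cache


def group_multi_knapsack_dp(groups, int_weights, last_weight, caps):
    """Return (F, T): minimum last-constraint weight and a witnessing selection.

    groups      : list of lists of hashable items (S_1..S_n)
    int_weights : list of m-1 dicts item -> nonnegative int (w'_1..w'_{m-1})
    last_weight : dict item -> nonnegative number (w'_m)
    caps        : m-1 nonnegative ints (p_1..p_{m-1})
    Returns (math.inf, None) if no selection respects the capacities.
    """

    @lru_cache(maxsize=None)
    def F(i, p):
        if any(x < 0 for x in p):  # some capacity exceeded
            return math.inf, None
        if i == 0:  # no groups left to serve
            return 0, ()
        best, best_set = math.inf, None
        for v in groups[i - 1]:  # try each element of S_i
            rest = tuple(pj - w[v] for pj, w in zip(p, int_weights))
            val, chosen = F(i - 1, rest)
            if val + last_weight[v] < best:
                best, best_set = val + last_weight[v], chosen + (v,)
        return best, best_set

    return F(len(groups), tuple(caps))
```

Calling it with `caps` equal to $(B'_1, \dots, B'_{m-1})$ and comparing the returned value with $B'_m$ mirrors the final step of the argument.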

As argued before, $T = T(n, B'_1, B'_2, \dots, B'_{m-1})$ is a good solution to $\mathcal{I}'$, provided that the original
instance $\mathcal{I}$ has a good solution. Now we take $T$ as our solution to $\mathcal{I}$. (We note that, if the original
instance $\mathcal{I}$ is not guaranteed to have a good solution, then we may have $F(n, B'_1, \dots, B'_{m-1}) > B'_m$,
in which case we will simply return an empty set. This can happen when Algorithm 3 is executed
with an incorrect value of OPT.) We have $w_m(T) = w'_m(T) \le B'_m = B_m$. For each $1 \le j \le m - 1$,
$w'_j(v) = \lfloor w_j(v) / A \rfloor \ge w_j(v) / A - 1$, and thus we have

$$
\begin{aligned}
w_j(T) = \sum_{v \in T} w_j(v) &\le \sum_{v \in T} \bigl(A \cdot w'_j(v) + A\bigr) = A \cdot \sum_{v \in T} w'_j(v) + nA \\
&\le A \cdot B'_j + n \cdot \epsilon \cdot w_{\max} / n \\
&\le B_j + \epsilon \cdot w_{\max} \\
&\le (1 + \epsilon) B_j \quad \text{(since } w_{\max} \le B_j\text{)}.
\end{aligned}
$$

Therefore, $T$ is a solution of $\mathcal{I}$ that satisfies one of the constraints and violates the others by at
most a factor of $1 + \epsilon$. (It is easy to see that, by modifying the definitions of $\{w'_j\}$ and $\{B'_j\}$, we can
make any one of the constraints the satisfied one.) The proof of Lemma 3 is thus complete.